A Study on Information Resource Evaluation for Text Categorization
Similar resources
Evaluation of decision forests on text categorization
Text categorization is useful for indexing documents for information retrieval, filtering parts for document understanding, and summarizing contents of documents of special interests. We describe a text categorization task and an experiment using documents from the Reuters and OHSUMED collections. We applied the Decision Forest classifier and compared its accuracies to those of C4.5 and kNN class...
Algorithms for Text Categorization: A Comparative Study
The significance of text categorization keeps growing as the volume of textual data escalates, which makes it important to analyse and examine the methods for handling textual data. This paper discusses and compares six algorithms for text categorization: Naïve Bayes, Support Vector Machine, NGrams, K-Nearest Neighbourhood, Back Propagation Network and Gen...
A Study of Text Preprocessing Tools for Arabic Text Categorization
Text preprocessing is an essential stage in text categorization (TC) particularly and text mining generally. Morphological tools can be used in text preprocessing to reduce multiple forms of the word to one form. There has been a debate among researchers about the benefits of using morphological tools in TC. Studies in the English language illustrated that performing stemming during the preproc...
A Study on Mutual Information-based Feature Selection for Text Categorization
Feature selection plays an important role in text categorization. Automatic feature selection methods such as document frequency thresholding (DF), information gain (IG), and mutual information (MI) are commonly applied in text categorization. Many existing experiments show that IG is one of the most effective methods; by contrast, MI has been demonstrated to have relatively poor performance....
A Review on Categorization of Text Data Using Side Information
In today’s digital environment, text databases are rapidly increasing due to the use of the internet and communication media. Different text mining techniques are used for knowledge discovery and information retrieval. Text data contains side information along with the text itself. Side information may be the metadata associated with text data like author, co-author or citation network, document pro...
Journal
Journal title: Journal of the Korean Society for Information Management
Year: 2007
ISSN: 1013-0799
DOI: 10.3743/kosim.2007.24.4.305